Speaker Identification for Swiss German with Spectral and Rhythm Features

نویسندگان

Athanasios Lykartsis

Stefan Weinzierl

Volker Dellwo

چکیده

We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in an speaker identification task to determine to which extent they are descriptive of speaker variability. We also test the performance of state-of-the-art but simple-to-extract frame-based features. The paper focus is the evaluation on one corpus (swiss german, TEVOID) using support vector machines. Results suggest that the general spectral features can provide very good performance on this dataset, whereas the rhythm features are not as successful in the task, indicating either the lack of suitability for this task or the dataset specificity. DOI: https://doi.org/10.17743/aesconf.2017.978-1-942220-15-2 Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-138208 Published Version Originally published at: Lykartsis, Athanasios; Weinzierl, Stefan; Dellwo, Volker (2017). Speaker Identification for Swiss German with Spectral and Rhythm Features. In: 2017 AES International Conference on Semantic Audio (June 2017), Erlangen, June 2017 June 2017. DOI: https://doi.org/10.17743/aesconf.2017.978-1-942220-15-2 Audio Engineering Society Conference Paper Presented at the Conference on Semantic Audio 2017 June 22 – 24, Erlangen, Germany This paper was peer-reviewed as a complete manuscript for presentation at this conference. This paper is available in the AES E-Library (http://www.aes.org/e-lib) all rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society. Speaker Identification for Swiss German with Spectral and Rhythm Features Athanasios Lykartsis1, Stefan Weinzierl1, and Volker Dellwo2 1Audio Communication Group, Technische Universität Berlin, Germany 2Phonetics Laboratory, Universität Zürich, Switzerland Correspondence should be addressed to Athanasios Lykartsis ([email protected]) ABSTRACT We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in an speaker identification task to determine to which extent they are descriptive of speaker variability. We also test the performance of state-of-the-art but simple-to-extract frame-based features. The paper focus is the evaluation on one corpus (swiss german, TEVOID) using support vector machines. Results suggest that the general spectral features can provide very good performance on this dataset, whereas the rhythm features are not as successful in the task, indicating either the lack of suitability for this task or the dataset specificity.We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in an speaker identification task to determine to which extent they are descriptive of speaker variability. We also test the performance of state-of-the-art but simple-to-extract frame-based features. The paper focus is the evaluation on one corpus (swiss german, TEVOID) using support vector machines. Results suggest that the general spectral features can provide very good performance on this dataset, whereas the rhythm features are not as successful in the task, indicating either the lack of suitability for this task or the dataset specificity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rhythmic variability between speakers: articulatory, prosodic, and linguistic factors.

Between-speaker variability of acoustically measurable speech rhythm [%V, ΔV(ln), ΔC(ln), and Δpeak(ln)] was investigated when within-speaker variability of (a) articulation rate and (b) linguistic structural characteristics was introduced. To study (a), 12 speakers of Standard German read seven lexically identical sentences under five different intended tempo conditions (very slow, slow, norma...

متن کامل

Perception of levels of emotion in prosody

Prosody conveys information about the emotional state of the speaker. In this study we test whether listeners are able to detect different levels in the emotional state of the speaker based on prosodic features such as intonation, speech rate and intensity. We ran a perception experiment in which we ask Swiss German and Chinese listeners to recognize the intended emotions that the professional ...

متن کامل

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

Speaker verification from talking a few words of sentences has many applications. Many methods as DTW, HMM, VQ and MQ can be used for speaker verification. We applied MQ for its precise, reliable and robust performance with computational simplicity. We also used pitch frequency and log gain contour for further improvement of the system performance.

متن کامل

Automatic Identification of Gender from Speech

Identifying the gender of a speaker from speech has a variety of applications ranging from speech analytics to personalizing human-machine interactions. While gender identification in previous work has explored the use of the statistical properties of the speaker’s pitch features, in this paper, we explore the impact of using spectral features in conjunction with pitch features on identifying g...

متن کامل

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

Speaker Identification for Swiss German with Spectral and Rhythm Features

نویسندگان

چکیده

منابع مشابه

Rhythmic variability between speakers: articulatory, prosodic, and linguistic factors.

Perception of levels of emotion in prosody

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

Automatic Identification of Gender from Speech

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

عنوان ژورنال:

اشتراک گذاری